SOFTCARDINALITY-CORE: Improving Text Overlap with Distributional Measures for Semantic Textual Similarity

Authors

  • Sergio Jiménez
  • Claudia Jeanneth Becerra
  • Alexander F. Gelbukh
Abstract

Soft cardinality proved to be a very strong text-overlap baseline for the semantic textual similarity (STS) task, obtaining third place in SemEval-2012. This year, besides the plain text-overlap approach, two distributional word-similarity functions derived from the ukWack corpus were tested within soft cardinality. These measures improved the performance of the text-overlap approach. Further, they were combined with other features using regression, obtaining the 18th, 22nd and 23rd positions among the 90 participating systems in the official 2013 shared task ranking at *SEM. After the release of the gold standard annotations of the test data, we observed that the bare similarity measures, without the use of regression, would have obtained the 6th, 7th and 8th positions. Moreover, the simple arithmetic average of these similarity measures would have been 4th (mean=0.5747). This paper describes the submitted system and the similarity measures that would have obtained those better results.
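
The abstract does not spell out how soft cardinality is computed, so the following is only a minimal sketch based on the commonly cited definition, in which each element contributes the reciprocal of its summed similarity to all elements of the collection. The character-trigram Dice function stands in for the plain text-overlap case; the names `char_ngrams`, `dice_sim`, `soft_cardinality` and `soft_similarity`, the exponent `p`, and the Dice-style combination of cardinalities are illustrative assumptions, not the submitted system.

```python
from typing import Callable, Sequence, Set

WordSim = Callable[[str, str], float]


def char_ngrams(word: str, n: int = 3) -> Set[str]:
    """Character n-grams of a word padded with '#' (hypothetical helper)."""
    padded = f"#{word}#"
    return {padded[i:i + n] for i in range(max(1, len(padded) - n + 1))}


def dice_sim(a: str, b: str) -> float:
    """Dice coefficient over character trigrams; equals 1.0 when a == b."""
    ga, gb = char_ngrams(a), char_ngrams(b)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def soft_cardinality(words: Sequence[str], sim: WordSim, p: float = 1.0) -> float:
    """Sum over words of 1 / sum_j sim(word, w_j)**p.

    Assumes sim(w, w) == 1, so every denominator is at least 1.
    Similar or repeated words share their contribution to the count.
    """
    return sum(1.0 / sum(sim(a, b) ** p for b in words) for a in words)


def soft_similarity(a: Sequence[str], b: Sequence[str], sim: WordSim,
                    p: float = 1.0) -> float:
    """Dice-style text similarity built from soft cardinalities (assumed form)."""
    card_a = soft_cardinality(a, sim, p)
    card_b = soft_cardinality(b, sim, p)
    if card_a + card_b == 0:
        return 0.0  # both texts are empty
    card_union = soft_cardinality(list(a) + list(b), sim, p)
    card_inter = card_a + card_b - card_union
    return 2 * card_inter / (card_a + card_b)
```

Under these assumptions, a distributional word-similarity function could be passed as `sim` in place of `dice_sim`, provided it returns 1 for identical words and values in [0, 1] otherwise.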

Similar resources

DLS@CU-CORE: A Simple Machine Learning Model of Semantic Textual Similarity

We present a system submitted in the Semantic Textual Similarity (STS) task at the Second Joint Conference on Lexical and Computational Semantics (*SEM 2013). Given two short text fragments, the goal of the system is to determine their semantic similarity. Our system makes use of three different measures of text similarity: word n-gram overlap, character n-gram overlap and semantic overlap. Usi...

Distributional semantic models for detection of textual entailment

We present our experiments on integrating and evaluating distributional semantics with the recognising textual entailment task (RTE). We consider entailment as semantic similarity between text and hypothesis coupled with additional heuristic, which can be either selecting the top scoring hypothesis or a pre-defined threshold. We show that a distributional model is particularly good at detecting...

Robust semantic text similarity using LSA, machine learning, and linguistic resources

Semantic textual similarity is a measure of the degree of semantic equivalence between two pieces of text. We describe the SemSim system and its performance in the *SEM 2013 and SemEval-2014 tasks on semantic textual similarity. At the core of our system lies a robust distributional word similarity component that combines Latent Semantic Analysis and machine learning augmented with data from se...

LIPN-CORE: Semantic Text Similarity using n-grams, WordNet, Syntactic Analysis, ESA and Information Retrieval based Features

This paper describes the system used by the LIPN team in the Semantic Textual Similarity task at SemEval 2013. It uses a support vector regression model, combining different text similarity measures that constitute the features. These measures include simple distances like Levenshtein edit distance, cosine, Named Entities overlap and more complex distances like Explicit Semantic Analysis, WordN...

Distributional Semantic Models for Clinical Text Applied to Health Record Summarization

As information systems in the health sector are becoming increasingly computerized, large amounts of care-related information are being stored electronically. In hospitals clinicians continuously document treatment and care given to patients in electronic health record (EHR) systems. Much of the information being documented is in the form of clinical notes, or narratives, containing primarily u...

Publication date: 2013